Our first visualization will be an interactive map showing the locations of gas stations in the continental United States. Each gas station will be a point on the map, and each point will have a hover text box containing its state, county, address, and zip code.
Since our source data contains nearly 73,000 observations, we will randomly select 500 observations.
gas<-read.csv("https://pengdsci.github.io/datasets/POC/POC.csv")
rand.df <- gas[sample(nrow(gas), size=500), ]
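Because `sample()` draws different rows on every run, the map will change each time the document is rebuilt. A minimal sketch of one way to make the draw repeatable is shown here; the toy data frame and the seed value are assumptions made for illustration and are not part of the original analysis.

```r
# Hypothetical stand-in for the gas station data frame.
toy <- data.frame(id = 1:100)

# Fixing the seed before sampling makes the random row selection repeatable.
set.seed(553)  # arbitrary seed, chosen only for illustration
first.draw <- toy[sample(nrow(toy), size = 5), , drop = FALSE]

# Resetting the same seed reproduces the same rows.
set.seed(553)
second.draw <- toy[sample(nrow(toy), size = 5), , drop = FALSE]

identical(first.draw, second.draw)
```

Calling `set.seed()` once before the `sample()` call on the full data would have the same effect there.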
We will use the plotly package to plot the data on an interactive map.
library(plotly)
g <- list( scope = 'usa',
projection = list(type = 'albers usa'),
showland = TRUE,
landcolor = toRGB("gray95"),
subunitcolor = toRGB("gray85"),
countrycolor = toRGB("gray85"),
countrywidth = 0.5,
subunitwidth = 0.5
)
fig <- plot_geo(rand.df, lat = ~ycoord, lon = ~xcoord) %>%
add_markers( text = ~paste(STATE, county, ADDRESS, ZIPnew,
sep = "<br>"),
#color = ,
symbol = "circle",
#size = ,
hoverinfo = "text") %>%
layout( title = 'Randomly Selected US Gas Stations',
geo = g )
fig
The above is a demonstration of a simple yet effective way to visualize data on a map. Our next visualization will be a little more complex.
For this visualization, we will begin with a dataset of crimes committed in the great city of Philadelphia between 2015 and early March of 2024.
We will need to subset the 2023 data before we can overlay it on a map. We will use the stringr package for this.
library(stringr)
crime<-read.csv("https://pengdsci.github.io/STA553VIZ/w08/PhillyCrimeSince2015.csv")
df<-crime
df$year <- str_extract(df$date, "\\d{4}")
df$year <- as.numeric(df$year)
crime23<-subset(df, year==2023)
write.csv(crime23, "C:\\Users\\Alex\\Documents\\R\\Grad\\553\\datasets\\wk7.csv")
A copy of the 2023 data can be found at https://raw.githubusercontent.com/AlexDragonetti/STA553/main/hw7/wk7.csv
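The year extraction above pulls the first run of four digits out of each date string. As a rough illustration of how that pattern behaves, here is a base R sketch of the same idea. The date strings are hypothetical, and the actual format of the `date` column may differ.

```r
# Hypothetical date strings; the real `date` column may be formatted differently.
dates <- c("2023-01-15 08:30:00", "2022-12-31 23:59:00", "2024-03-01 00:10:00")

# regexpr() locates the first run of four digits in each string and
# regmatches() returns the matched text. Unlike str_extract(), regmatches()
# drops strings with no match instead of returning NA for them.
year <- as.numeric(regmatches(dates, regexpr("[0-9]{4}", dates)))

# Keep only the entries from 2023.
dates.2023 <- dates[year == 2023]
dates.2023
```

This works only when the year is the first four-digit run in the string, which is the same assumption the `str_extract()` call makes.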
#remove observations with missing values - at least one has a missing value for coordinates
crime23.nona<-na.omit(crime23)
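Note that `na.omit()` drops a row if any column is missing, not just the coordinates. If only the coordinates matter for mapping, one possible alternative is to filter on those columns alone. The sketch below uses a small made-up data frame; only the column names `lat` and `lng` come from the data used here, and `note` is a hypothetical extra column.

```r
# Made-up rows: one is missing a coordinate, one is missing an unrelated field.
toy <- data.frame(
  lat  = c(39.95, NA, 40.01),
  lng  = c(-75.16, -75.20, -75.10),
  note = c("a", "b", NA)  # hypothetical column not needed for mapping
)

# na.omit() removes every row with a missing value in any column.
all.complete <- na.omit(toy)

# complete.cases() on the coordinate columns keeps rows that can still be mapped.
has.coords <- toy[complete.cases(toy[, c("lat", "lng")]), ]

nrow(all.complete)
nrow(has.coords)
```

Whether the stricter or the looser filter is appropriate depends on which columns the popups need.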
Finally, we will map the incident data using the leaflet
package.
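library(leaflet)
# One color per incident (row); nrow(), not length(), gives the row count of a data frame
color2 <- rep("red", nrow(crime23.nona))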
color2[which(crime23.nona$fatal=="Nonfatal")] <- "blue"
color2[which(crime23.nona$fatal=="Fatal")] <- "red"
label.msg <- paste("Street:", crime23.nona$street_name,
"<br>Block Number:",crime23.nona$block_number,
"<br>Neighborhood:", crime23.nona$neighborhood,
"<br>Incident Type:", crime23.nona$fatal)
leaflet(crime23.nona) %>%
addTiles() %>%
setView(lng=mean(crime23.nona$lng), lat=mean(crime23.nona$lat), zoom = 11) %>%
addProviderTiles(providers$Esri.WorldGrayCanvas) %>%
addCircleMarkers(
~lng,
~lat,
color = color2,
stroke = FALSE,
fillOpacity = 0.5,
popup= ~label.msg) %>%
addLegend(position = "bottomright",
colors = c("red", "blue"),
labels= c("Fatal", "Nonfatal"),
title= "Type of Incident",
opacity = 0.4)
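The marker colors above are assigned by indexing with `which()`. The same mapping can also be written as a lookup in a named vector, which keeps the category-to-color pairs in one place. This is only a sketch with made-up values; `incident.colors` and `marker.colors` are hypothetical names.

```r
# Made-up incident types standing in for crime23.nona$fatal.
fatal <- c("Fatal", "Nonfatal", "Nonfatal", "Fatal")

# Named vector: each incident type maps to its marker color.
incident.colors <- c(Fatal = "red", Nonfatal = "blue")

# Indexing by name returns one color per incident; unname() drops the labels.
marker.colors <- unname(incident.colors[fatal])
marker.colors
```

Any incident type that is missing from the lookup comes back as `NA`, which makes unexpected categories easy to spot.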
Our resulting map is fully interactive: clicking a dot will show details of the incident.
Please note that the above map has spots that appear purple. This is due to semi-transparent red and blue dots overlapping, which indicates both fatal and nonfatal incidents at the same address. For example, 1000 E Bristol Street in Juniata saw an incident with two victims, one being a fatality. For clarification on any confusing point, please refer to the dataset linked at the end of Preparing the Data.
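To check which locations account for the purple spots, one could look for coordinate pairs that carry both incident types. The following is a minimal sketch on made-up rows, assuming that overlapping incidents share exactly the same `lat` and `lng` values; markers that are merely close together would not be caught this way.

```r
# Made-up incidents: the first two share a location but differ in type.
toy <- data.frame(
  lat   = c(40.0, 40.0, 39.9, 39.9, 39.8),
  lng   = c(-75.1, -75.1, -75.2, -75.2, -75.3),
  fatal = c("Fatal", "Nonfatal", "Nonfatal", "Nonfatal", "Fatal")
)

# Count the distinct incident types recorded at each coordinate pair.
types.per.spot <- aggregate(fatal ~ lat + lng, data = toy,
                            FUN = function(x) length(unique(x)))

# Locations with more than one type are the ones that would blend to purple.
mixed <- types.per.spot[types.per.spot$fatal > 1, ]
mixed
```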